Dynamic Language Model Adaptation usin
نویسنده
چکیده
We propose an unsupervised dynamic language model (LM) adaptation framework using long-distance latent topic mixtures. The framework employs the Latent Dirichlet Allocation model (LDA) which models the latent topics of a document collection in an unsupervised and Bayesian fashion. In the LDA model, each word is modeled as a mixture of latent topics. Varying topics within a context can be modeled by re-sampling the mixture weights of the latent topics from a prior Dirichlet distribution. The model can be trained using the variational Bayes Expectation Maximization algorithm. During decoding, mixture weights of the latent topics are adapted dynamically using the hypotheses of previously decoded utterances. In our work, the LDAmodel is combined with the trigram language model using linear interpolation. We evaluated the approach on the CCTV episode of the RT04 Mandarin Broadcast News test set. Results show that the proposed approach reduces the perplexity by up to 15.4% relative and the character error rate by 4.9% relative depending on the size and setup of the training set.
منابع مشابه
Dynamic Language Model Adaptation Using Latent Topical Information and Automatic Transcripts
This paper investigates dynamic language model adaptation for Mandarin broadcast news recognition. A topical mixture model was presented to dynamically explore the long−span latent topical information for language model adaptation. The underlying characteristics and different kinds of model structures were extensively investigated, while their performance was verified by comparison with the con...
متن کاملImproving out-of-coverage lang multimodal dialogue system usin
For automatic speech recognition, the construction of an adequate language model may be difficult when only a limited amount of training text is available. Previous work has shown that in the case of small training sets statistical language models may outperform grammars on out-of-coverage utterances, while showing comparable performance on incoverage input. In this paper, we compare the perfor...
متن کاملRobust Speech Recognition Usin Intra-speaker Ada
Inter-speaker variation can be coped rather well in speech recognition by speaker adaptation techniques such as MLLR and MAP. However, when dealing with speech other than reading style, such as conversational speech, emotional speech and so on, current recognition systems cannot achieve a satisfactory performance even after speaker adaptation. In view of this situation, two-level adaptation met...
متن کاملN-gram Adaptation with Dynamic Interpolation Coefficient Using Information Retrieval Technique
This study presents an N-gram adaptation technique when additional text data for the adaptation do not exist. We use a language modeling approach to the information retrieval (IR) technique to collect the appropriate adaptation corpus from baseline text data. We propose to use a dynamic interpolation coefficient to merge the N-gram, where the interpolation coefficient is estimated from the word...
متن کاملPersian Adaptation of Enhanced Milieu Teaching for Iranian Children With Expressive Language Delay
Objectives: This study aimed at adapting and examining the applicability of the Teach-Model-Coach-Review model of the enhanced milieu teaching (EMT) approach for improving Iranian mothers’ language strategies while interacting with their toddlers with expressive language delay. Methods: In a single-subject multiple-baseline across-behavior study, the mothers of 3 toddlers with expressive langu...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2005